1.  Introduction


The analysis of atmospheric moisture remains a challenging problem in
data assimilation today. Relative humidity (RH) varies on horizontal scales
that are smaller than the typical separation distance of radiosondes over
Northern Hemisphere continents, and there are virtually no radiosonde
measurements over large portions of the Southern Hemisphere and the Northern
Hemisphere oceans. Satellite observations may help to improve our knowledge
of the RH field.

Moisture information may be obtained from satellite sounding data by
retrieval techniques similar to those used for temperature retrievals, or by
use of cloudiness information contained in satellite imagery. Since moisture
(and temperature) retrieval, at least from infrared radiances, is difficult
in cloudy atmospheres, inference of humidity data from cloudiness information
may successfully supplement moisture retrieval from radiance data. Here we
study the inference of humidity profiles from cloud data, using the 3DNEPH
data base as the source of the cloudiness information. The 3DNEPH (now
RTNEPH) is a high resolution cloud data base produced operationally by the
US Air Force Global Weather Central (AFGWC). A colocation study of cloud
data with radiosonde measurements of relative humidity is used to develop and
test a statistical method for inferring humidity profiles; a global data
impact study is used to assess the utility of this moisture information. In
this report we review some of the previously developed methods for inferring
humidity from cloud cover data, describe the data base and processing used in
our colocation study, and discuss the development and testing of the new
method for inferring humidity. We then describe the data impact test and
summarize our results and conclusions.

2.  Background

Several different techniques for inferring relative humidity from cloud
information exist. They may be grouped into two categories: level-by-level
approaches, which use relationships between relative humidity and cloud cover
at a particular level or layer of the atmosphere, and profile approaches,
which infer vertical profiles of relative humidity from cloud information.

Of the level-by-level approaches, the one most widely used is described
in Chu and Parrish (1977). As implemented by Tibaldi (1982), humidity is
determined in the boundary layer (assumed to be 50 hPa thick) and in three
layers in the troposphere between the boundary layer and 300 hPa from
cloudiness observations (both from surface observers and from satellites),
using a relationship of the form:

$$ RH = M - A \cos(\pi \eta), \tag{1} $$

where RH is the relative humidity, $\eta$ the fractional cloud cover, and the
coefficients M and A depend on the humidity layer (see Appendix A).
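
As a rough illustration of equation (1), read here as RH = M - A cos(pi * eta),
the following sketch evaluates the relationship for a few cloud cover values.
The function name and the coefficients M_DEMO and A_DEMO are hypothetical and
chosen only for illustration; they are not the Appendix A values.

```python
import math


def rh_from_cloud_cover(eta, m, a):
    """Evaluate RH = M - A*cos(pi*eta) for a fractional cloud cover eta."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("cloud cover must be a fraction between 0 and 1")
    return m - a * math.cos(math.pi * eta)


# Hypothetical coefficients for a single layer, in percent RH.
# These are NOT the Appendix A values.
M_DEMO, A_DEMO = 60.0, 25.0

if __name__ == "__main__":
    for eta in (0.0, 0.25, 0.5, 0.75, 1.0):
        rh = rh_from_cloud_cover(eta, M_DEMO, A_DEMO)
        print(f"eta = {eta:4.2f}  ->  RH = {rh:5.1f} %")
```

With this form, clear sky gives RH = M - A, overcast gives RH = M + A, and
half cloud cover gives RH = M, so M acts as a mid-range humidity and A as the
half-range for the layer.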

The AFGWC uses humidity-cloudiness relationships within its 3DNEPH (now
RTNEPH) analysis program. As described in Fye (1978), relative humidity is
expressed in terms of a condensation pressure spread (CPS), which is the
pressure increment through which an air parcel must be lifted to reach
saturation. The condensation pressure spread is then related to cloud cover
by an empirically derived curve for each mandatory pressure level (see
Appendix A).

Rasmussen (1982) used 3DNEPH data to derive multiple regression equations
relating mandatory level relative humidity to cloud information. Separate
regression equations were used for the different mandatory levels.

Relationships between relative humidity and cloud cover are also used
within the interactive radiation parameterizations of some prediction
models. For example, Geleyn (1981) describes one such relationship as

$$ \eta = \left\{ \max\left[ 0, \frac{RH - RH_c}{1 - RH_c} \right] \right\}^2, \tag{2} $$

where the critical relative humidity $RH_c$ depends on pressure (see
Appendix A). The above relationship is easily inverted to obtain bogus
humidity from cloudiness.
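
The inversion mentioned above can be sketched as follows, assuming equation
(2) has the form eta = max(0, (RH - RH_c)/(1 - RH_c))^2 with RH expressed as
a fraction. For nonzero cloud cover the inverse is
RH = RH_c + (1 - RH_c) * sqrt(eta). The function names and the value of RH_c
used in the demonstration are hypothetical, not the Appendix A values.

```python
import math


def cloud_cover_from_rh(rh, rh_crit):
    """Forward relation: eta = max(0, (RH - RH_c) / (1 - RH_c)) ** 2."""
    return max(0.0, (rh - rh_crit) / (1.0 - rh_crit)) ** 2


def bogus_rh_from_cloud_cover(eta, rh_crit):
    """Inverse relation: RH = RH_c + (1 - RH_c) * sqrt(eta).

    For eta == 0 the forward relation only says RH <= RH_c, so the value
    returned here (RH_c) is an upper bound rather than an estimate.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("cloud cover must be a fraction between 0 and 1")
    return rh_crit + (1.0 - rh_crit) * math.sqrt(eta)


if __name__ == "__main__":
    RH_CRIT_DEMO = 0.8  # hypothetical critical RH for one pressure level
    for eta in (0.0, 0.25, 1.0):
        print(eta, bogus_rh_from_cloud_cover(eta, RH_CRIT_DEMO))
```

Because zero cloud cover maps to any humidity at or below RH_c, the inverse
is not unique in clear-sky cases, which is one way a positive bias can arise
when the inverted formula is applied to all observations.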

Saito and Baba (1988) investigated level-by-level approaches in a
colocation study over the western Pacific Ocean. The cloud cover data were
derived from infrared imagery observed by the GMS satellite. They arrived at
a modified form of equation (2), in which a small but nonzero cloud cover was
allowed even for subcritical relative humidities.

Norquist (1988) investigated several of these level-by-level approaches
in a global colocation study using 3DNEPH data. Based on a comparison of the
Tibaldi, AFGWC, and inverse ECMWF schemes, he found the Tibaldi method to have
the smallest errors. He demonstrated the potential usefulness of bogus RH
data inferred from 3DNEPH cloud cover data with data assimilation experiments
in which the bogus RH was used to replace RH data measured by radiosondes.

A potential problem with all of these level-by-level approaches is the
rather large uncertainty of the cloud height assignment in most cloud cover
data, which will lead to errors in the relationships between cloud cover and
relative humidity. In addition, cloud cover at a given level is related to
moisture (and other atmospheric variables) at other levels in some
meteorological situations, most notably for the case of convective clouds.
This problem is addressed by the profile approaches, which attempt to
retrieve an entire vertical profile of moisture from cloud cover data.

An example of the profile approach is the technique used at the Japanese
Meteorological Service and described in Kanamitsu (1984). Cloud information,
including cloud cover, variability of cloud cover, and cloud top, is used to
identify one of 60 different categories, each of which is associated with a
typical relative humidity profile. The categories are defined a priori.

A similar approach was used by Mills and Davidson (personal
communication, 1987), who used total cloud cover, the variance of cloud top
temperature, and the height of maximum cloudiness to define categories with
typical RH profiles.


The aim of our regression study is to develop statistical methods to
infer RH profiles from the 3DNEPH cloud data base. We used the empirical
orthogonal functions (EOFs) of RH to determine the important features of the
observed RH profiles. The EOF coefficients were related to colocated cloud
data by means of multivariate linear regression equations. The data used in
the regression study were restricted to the North American continent,
resulting in

3.  Data Base and Processing


The  data  base  used  in  the  colocation  study  consists  of  radiosonde  and 
cloudiness  data  over  North  America  during  February  and  June  of  1979. 

The radiosonde data were extracted from the final, reprocessed FGGE II-b
data set, and subjected to an additional quality control. Both mandatory and
significant level data were used. A total of 2699 (2235) soundings for 97
stations were extracted from the FGGE data set for 00 UTC 5 February through
00 UTC 22 February (00 UTC 14 June through 00 UTC 28 June). The locations
of the radiosonde stations are shown in Fig. 1. A preliminary investigation
using mandatory pressure levels revealed a large number of missing values at
1000 hPa and 850 hPa, and a dependence of the RH profile, when defined with
respect to pressure levels, on station elevation. For these reasons, the
relative humidity data were interpolated from the pressure levels to a
terrain-following coordinate system, and an EOF analysis was performed on the
result. The coordinate system was a modified sigma coordinate:


$$ \sigma = \frac{p - p_t}{p_s - p_t}, $$ where $p_t$ = 300 hPa and $p_s$ = surface pressure.
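
The following sketch shows how the modified sigma coordinate maps to
pressure for the 15 levels used in this study, which lie between sigma = 1
and sigma = 0.15. It assumes the levels are equally spaced in sigma; the
function names are hypothetical.

```python
import numpy as np

P_TOP = 300.0  # hPa, p_t in the coordinate definition


def sigma_levels(n_levels=15, sigma_bottom=1.0, sigma_top=0.15):
    """Return n_levels values equally spaced in sigma (assumed spacing)."""
    return np.linspace(sigma_bottom, sigma_top, n_levels)


def sigma_to_pressure(sigma, p_surface, p_top=P_TOP):
    """Invert sigma = (p - p_t) / (p_s - p_t) to get pressure in hPa."""
    return p_top + np.asarray(sigma) * (p_surface - p_top)


if __name__ == "__main__":
    levels = sigma_levels()
    for p_surface in (1000.0, 967.0, 850.0):
        p = sigma_to_pressure(levels, p_surface)
        print(f"p_s = {p_surface:6.1f} hPa: bottom {p[0]:7.2f}, top {p[-1]:7.2f} hPa")
```

For a surface pressure of 967 hPa the top level falls at about 400 hPa,
which matches the threshold quoted in section 4 for soundings whose top
sigma level lies below the 400 hPa level.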


A total of 15 equally spaced levels were used between $\sigma = 1$ and
$\sigma = 0.15$. Only soundings with enough measurements to allow this
interpolation were used in the EOF computations; this reduced the total
number of soundings to 1918 (1724) for February (June). Given the EOFs, any
relative humidity profile ($RH_k$, k = 1, ..., 15) may be expressed in terms
of an EOF expansion as follows:

$$ RH_k = \overline{RH}_k + \sum_{m=1}^{15} e_m E_{m,k}, $$

where $\overline{RH}_k$ is the climatological mean RH profile, $e_m$ the
coefficient of the m-th EOF, and $E_{m,k}$ the value of the m-th EOF at
level k. We computed the EOFs as the eigenvectors of the relative humidity
covariance matrix. Two separate sets of EOFs were computed for each month:
one (denoted EOF-All) based on the covariance matrix about the mean of all
radiosondes in the extracted data set for each month, and another (denoted
EOF-Bands) based on the covariance matrix about the mean humidity profiles
computed separately for three latitude bands of width 10°, spanning 20°N to
50°N. The mean profiles and associated standard deviations are shown in
Fig. 2 for EOF-All and in Fig. 3 for EOF-Bands. Except for the February
means, differences between the band statistics are small. The profiles of
the first three EOFs are shown in Figs. 4 and 5 for EOF-All and EOF-Bands,
respectively. The leading EOFs are similar for February and June; EOF-All
and EOF-Bands have basically the same structure. The amount of variance
explained by these EOFs is shown in Table 1. In comparing numbers for
EOF-All and EOF-Bands one must bear in mind that the total amount of variance
about the mean is somewhat smaller for EOF-Bands than for EOF-All.
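
A minimal sketch of the EOF computation described above is given below:
the EOFs are taken as eigenvectors of the covariance matrix of RH on the
sigma levels, and a profile is rebuilt from its mean and EOF coefficients.
The function names are hypothetical, and the normalization by the number of
soundings minus one is an assumption; it does not affect the EOFs or the
percentages of explained variance.

```python
import numpy as np


def compute_eofs(rh):
    """rh has shape (n_soundings, n_levels); returns mean, EOFs, % variance."""
    mean = rh.mean(axis=0)
    anomalies = rh - mean
    cov = anomalies.T @ anomalies / (rh.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvals = eigvals[order]
    eofs = eigvecs[:, order].T              # row m holds EOF m at each level
    explained = 100.0 * eigvals / eigvals.sum()
    return mean, eofs, explained


def eof_coefficients(rh, mean, eofs):
    """Project RH anomalies onto the EOFs to get the coefficients e_m."""
    return (rh - mean) @ eofs.T


def reconstruct(coeffs, mean, eofs, n_modes):
    """Rebuild profiles from the mean plus the leading n_modes EOFs."""
    return mean + coeffs[:, :n_modes] @ eofs[:n_modes]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rh_demo = rng.uniform(0.0, 100.0, size=(200, 15))  # synthetic profiles
    mean, eofs, explained = compute_eofs(rh_demo)
    print(np.round(explained[:3], 1))
```

For EOF-Bands the same routine would be applied after subtracting the mean
profile of each latitude band instead of the overall mean.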

Table 1: Percent of variance explained by the RH EOFs. EOF-All and EOF-Bands
refer to the EOFs defined with respect to latitude-independent and
latitude-dependent mean RH profiles, respectively.

         |       February        |         June
EOF No.  |  EOF-All   EOF-Bands  |  EOF-All   EOF-Bands
---------+-----------------------+----------------------
    1    |   54.4       51.5     |   44.6       45.1
    2    |   22.6       23.4     |   21.6       21.1
    3    |    8.0        8.7     |   10.4       10.5
    4    |    4.9        5.3     |    7.0        6.7
    5    |    2.9        3.2     |    4.4        4.5
    6    |    1.9        2.1     |    2.8        2.9
    7    |    1.3        1.4     |    2.2        2.3
    8    |    1.0        1.1     |    1.7        1.7
    9    |    0.8        0.8     |    1.2        1.2
   10    |    0.6        0.6     |    1.0        1.0
   11    |    0.4        0.5     |    0.8        0.8
   12    |    0.4        0.4     |    0.7        0.7
   13    |    0.3        0.4     |    0.6        0.6
   14    |    0.3        0.4     |    0.5        0.5
   15    |    0.3        0.3     |    0.3        0.3
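
As a small worked example, the cumulative variance explained by the leading
EOFs can be summed from the Table 1 entries. The values below are the first
three rows of Table 1 as read here; the names are hypothetical.

```python
# Percent of variance explained by EOFs 1-3, copied from Table 1.
EXPLAINED = {
    ("February", "EOF-All"): [54.4, 22.6, 8.0],
    ("February", "EOF-Bands"): [51.5, 23.4, 8.7],
    ("June", "EOF-All"): [44.6, 21.6, 10.4],
    ("June", "EOF-Bands"): [45.1, 21.1, 10.5],
}


def cumulative_percent(values):
    """Running total of explained variance, rounded to one decimal."""
    total, out = 0.0, []
    for value in values:
        total += value
        out.append(round(total, 1))
    return out


if __name__ == "__main__":
    for key, values in EXPLAINED.items():
        print(key, cumulative_percent(values))
```

On these numbers the two leading EOFs together account for roughly 75 to 77
percent of the variance in February and about 66 percent in June.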

The cloudiness data base used for this study is the Air Force 3DNEPH
analysis. The 3DNEPH data set is a global gridded data set with a resolution
of 47.6 km on a polar stereographic grid (the so-called eighth-mesh grid).
The grid for each hemisphere is subdivided into 64 boxes, each of which
contains 64x64 gridpoints. The data at each gridpoint consist of percent
cloud cover for total sky cover and for 15 layers in the vertical, as well as
several other parameters (cloud type, base and top heights, terrain height,
and present weather; see Appendix B for a detailed description of the
parameters and their recoding for this colocation study). In addition, a
vertically compacted set of cloud cover values, which correspond to boundary
layer clouds and cloud cover for layers surrounding the 6 mandatory levels
between 1000 hPa and 300 hPa, was derived from the 15 layer values. The
vertical compaction, described in more detail in Appendix B, reduces the data
volume without a significant loss of information, since the cloud cover
values in the 15 3DNEPH layers are partially redundant. The 3DNEPH data are
derived primarily from IR and visible satellite imagery, and supplemented
with conventional surface, radiosonde, and aircraft reports. For the present
study we extracted 3DNEPH data for the gridpoints closest to the extracted
radiosonde stations; in addition to the values at the gridpoint itself
(referred to in the following as central values), the neighboring 24 points
were used to compute means and standard deviations. Data were extracted for
the 3DNEPH boxes 43, 44, and 45. The 69 sounding locations that fall within
the extracted 3DNEPH boxes are shown in Fig. 1 as stars.
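
The extraction of central values and local statistics can be sketched as
below. The sketch assumes that the 24 neighbors together with the central
gridpoint form a 5x5 window, that the central point is included in the mean,
and that the population standard deviation is used; none of these details is
stated in the text. The function name is hypothetical.

```python
import numpy as np


def neighborhood_stats(field, i, j, half_width=2):
    """Return (central value, window mean, window std) around point (i, j).

    With half_width=2 the window is 5x5, i.e. the central point plus its
    24 neighbors (assumed layout).
    """
    lo_i, hi_i = i - half_width, i + half_width + 1
    lo_j, hi_j = j - half_width, j + half_width + 1
    if lo_i < 0 or lo_j < 0 or hi_i > field.shape[0] or hi_j > field.shape[1]:
        raise IndexError("window extends beyond the edge of the box")
    window = field[lo_i:hi_i, lo_j:hi_j]
    return field[i, j], window.mean(), window.std()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    box = rng.uniform(0.0, 100.0, size=(64, 64))  # synthetic cloud cover, %
    print(neighborhood_stats(box, 30, 30))
```

A station near the edge of a 64x64 box would need points from the adjacent
box; the sketch simply raises an error in that case.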

4.  Regression  Study 

The data base described in the previous section was used to develop and
test a new method to infer an RH profile from 3DNEPH data. The data set for
each month was subdivided into a dependent and an independent sample; the
former was used to develop multiple regression equations for the EOF
coefficients of relative humidity, the latter was used to validate the
regression equations and to compute error statistics of the bogus RH data.

The dependent sample for February, which covers data from 00 UTC 5
February 1979 to 12 UTC 16 February 1979, was used for some exploratory,
univariate correlation calculations between 3DNEPH data and EOF coefficients.
Aside from identifying the most promising-looking predictors from the 3DNEPH
data set, these computations were used to assess the effect of stratifying
the sample into various subsets. One such stratification, based on the time
of day, showed correlation coefficients to be consistently higher for the
00 UTC sample than for the 12 UTC sample or for the aggregate 00 UTC and
12 UTC sample. Control runs using a random subsample showed these
differences to be significant. The most likely reason is that the cloud data
over the U.S. are more reliable and less noisy at 00 UTC, because the primary
data source of the 3DNEPH is IR imagery and in the late afternoon the
temperature contrast between the land surface and the cloud tops is greatest.

Correlation coefficients between individual 3DNEPH variables and the
coefficients of either EOF 1 or 2 showed no consistent differences between
latitude bands (30°-40°N and 40°-50°N); similar correlation coefficients for
EOF 3 were nonzero only in the 30°-40°N sample. Finally, the correlation
coefficients for EOFs 1 and 2 were consistently higher for the horizontally
averaged 3DNEPH values than for the corresponding central values, while the
standard deviations of the 3DNEPH variables showed no useful correlations at
all.
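
The exploratory stratified correlations described above can be sketched as
follows. The function name is hypothetical, the labels are assumed to be
integer observation hours (0 or 12), and the data in the demonstration are
synthetic, constructed only so that one subsample is noisier than the other.

```python
import numpy as np


def stratified_correlation(predictor, coefficient, hours):
    """Pearson correlation for the full sample and for each observation hour."""
    predictor = np.asarray(predictor, dtype=float)
    coefficient = np.asarray(coefficient, dtype=float)
    hours = np.asarray(hours)
    result = {"all": float(np.corrcoef(predictor, coefficient)[0, 1])}
    for hour in np.unique(hours):
        mask = hours == hour
        r = np.corrcoef(predictor[mask], coefficient[mask])[0, 1]
        result[int(hour)] = float(r)
    return result


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.uniform(0.0, 100.0, 400)       # synthetic cloud cover, %
    hours = np.tile([0, 12], 200)              # alternating 00 and 12 UTC
    noise = np.where(hours == 0, 5.0, 60.0)    # 12 UTC made noisier on purpose
    coeff = 0.5 * cloud + rng.normal(0.0, 1.0, 400) * noise
    print(stratified_correlation(cloud, coeff, hours))
```

The same routine could be applied with latitude-band labels, or with central
values and horizontal means as alternative predictors.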

A stepwise regression procedure was used to derive multiple regression
equations for the coefficients of EOF 1 and 2. Stepwise regression is a
method for determining the "best" set of predictors (Neter and Wasserman,
1974, Chapter 11; Draper and Smith, 1966, Chapter 6). In our case, some of
the 3DNEPH data have large biases or random observational errors and many of
the variables are intercorrelated. We have the original data, the vertically
compacted data, as well as the local area average and variance of the
original and vertically compacted data. On the dependent sample, we are sure
to explain more of the variance of the EOF coefficients for each predictor we
add to our regression equations. However, using all potential predictors is
sure to overfit the data, resulting in poorer performance on the independent
sample. Therefore we must employ some means of deciding which predictors to
use.

The stepwise regression procedure iteratively adds and deletes variables
from the prediction equation. At each step, all possible predictors not yet
included in the regression are considered for inclusion. The one reducing
the unexplained variance the most is added to the regression if this
reduction in variance is significant, as judged by the coefficient of partial
correlation, which is a type of F statistic (op. cit.). Since the predictor
added may be highly correlated with a previous predictor, all current
predictors are then considered for deletion. The predictor having the
smallest partial correlation is deleted if the associated reduction in
variance is so small as to be insignificant. The iteration stops when no
further changes are made. The algorithm we used is based on that of
Efroymson (1960).

Based on the exploratory, univariate correlation statistics, some
preliminary choices of predictors were made. In order to use the best
available data for the development of the multiple regression equations, only
00 UTC data were used; multiple regression equations were based on the
horizontal mean values only (tests of multiple regression equations using
central values resulted in less skillful predictions of the EOF
coefficients); multiple regression equations for EOFs 1 and 2 were developed
without any stratification based on latitude. The multiple regression
equation for EOF 3 for the 30°-40°N latitude band was able to explain only
30% of its variance. Since EOF 3 contributes only a small amount of the RH
variance, its prediction was not pursued further.

Two separate regression equations were developed for each EOF, one using
only the vertically compacted cloud cover data, and another using all mean
3DNEPH variables. In the latter case, cloud cover data for 3DNEPH layers
were added to the predictor sets by the stepwise regression algorithm only
after the vertically compacted predictors were exhausted. The corresponding
increases in the amount of explained variance were rather modest, however,
because of high correlations between the vertically compacted and the layer
data. An example of this is shown in Fig. 6: scatterplots of EOF 1
coefficients versus cloud cover in layer 10 show a relatively high
correlation, whereas the residuals from the regression equation using the
vertically compacted data are already much less correlated, and including
layer 10 cloud cover in the regression then removes the remaining
correlation. Table 2 shows the predictor sets and the fraction of explained
variance ($r^2$) for the coefficients of EOF 1 and 2, both for the global
mean and latitude-dependent mean (EOF-All and EOF-Bands), for February and
June. We note that the $r^2$ values are slightly lower for EOF-Bands than
for EOF-All.
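
The add-and-delete iteration of stepwise regression can be sketched as
follows. This is a generic illustration in the spirit of the procedure
described above, not a reproduction of the algorithm used in this study:
it uses partial F statistics with hypothetical thresholds f_enter and
f_remove, and all function names are assumptions.

```python
import numpy as np


def _rss(X, y, cols):
    """Residual sum of squares for y regressed on an intercept and X[:, cols]."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)


def stepwise_select(X, y, f_enter=4.0, f_remove=3.9, max_iter=100):
    """Return the list of selected predictor columns of X."""
    n, p = X.shape
    selected = []
    for _ in range(max_iter):
        changed = False
        # Forward step: add the candidate with the largest partial F,
        # provided it exceeds the entry threshold.
        rss_now = _rss(X, y, selected)
        best, best_f = None, f_enter
        for c in range(p):
            if c in selected:
                continue
            rss_new = max(_rss(X, y, selected + [c]), 1e-12)
            dof = n - len(selected) - 2
            f = (rss_now - rss_new) / (rss_new / dof)
            if f > best_f:
                best, best_f = c, f
        if best is not None:
            selected.append(best)
            changed = True
        # Backward step: drop the predictor with the smallest partial F,
        # if it falls below the removal threshold.
        if len(selected) > 1:
            rss_full = max(_rss(X, y, selected), 1e-12)
            dof = n - len(selected) - 1
            worst, worst_f = None, f_remove
            for c in selected:
                rest = [s for s in selected if s != c]
                f = (_rss(X, y, rest) - rss_full) / (rss_full / dof)
                if f < worst_f:
                    worst, worst_f = c, f
            if worst is not None:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected
```

Keeping f_remove slightly below f_enter prevents a predictor from being
added and removed in endless alternation. The fraction of explained variance
follows from the residual sums of squares as 1 - RSS(selected) / RSS(none).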

An evaluation of these regression equations was performed both for the
dependent sample and the independent sample. For a comparison of the
regression equations with existing methods of estimating humidity, the bogus
RH profiles were interpolated from the $\sigma$-coordinate system to
mandatory pressure levels. For the purposes of interpolating, RH was assumed
to vary linearly as a function of the logarithm of pressure. The
interpolated bogus RH profiles were then compared with colocated RAOB data,
along with bogus RH obtained by existing methods. Bias and RMSE statistics
were computed over all colocated 3DNEPH data points, even if the cloud cover
used in the inference was zero; to reduce the positive bias likely to result
from this procedure, the bogus humidity predicted by the existing methods was
modified to be no higher than climatology in cases of zero cloud cover. A
recalculation of the statistics, excluding all cases in which cloud cover
values used as predictors were less than 10%, showed that the results were
not significantly affected by the use of cases with zero cloud cover.

Table 2: Predictor sets and $r^2$ for the regression equations. Results are
shown separately for EOF 1 and 2. The predictor set of the vertically
compacted set is shown first; additional 3DNEPH data of the second regression
equation are shown in the second column. $\eta$ denotes cloud cover, $r^2$
the fraction of variance explained by the regression.

Regression           | Vertically compacted data          | Full 3DNEPH data set
equation             | Predictors                   $r^2$ | Additional predictors            $r^2$
---------------------+------------------------------------+---------------------------------------
EOF 1:               |                                    |
EOF-All, February    | $\eta$ at 850, 700, 500 hPa   .554 | $\eta$ at layer 10                .568
EOF-Bands, February  | $\eta$ at 700, 500, 400 hPa   .514 | $\eta$ at layer 10                .530
EOF-All, June        | $\eta$ at 700, 400 hPa        .610 | $\eta$ at layer 15                .623
EOF-Bands, June      | $\eta$ at 700, 400 hPa        .606 | $\eta$ at layer 15                .618
                     |                                    |
EOF 2:               |                                    |
EOF-All, February    | $\eta$ at 850, 700,           .413 | $\eta$ at layer 9                 .434
                     |   500, 300 hPa                     |
EOF-Bands, February  | $\eta$ at 850, 700,           .394 | $\eta$ at layer 9                 .414
                     |   500, 300 hPa                     |
EOF-All, June        | terrain height,               .333 | $\eta$ at 700 hPa,                .484
                     |   $\eta$ at 850, 500 hPa           |   layer 8, 12;
                     |                                    |   low cloud type
EOF-Bands, June      | terrain height,               .328 | $\eta$ at layer 8, 12;            .481
                     |   $\eta$ at 700, 400 hPa           |   low, middle cloud type
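
The interpolation of a bogus RH profile to mandatory pressure levels, with
RH taken to vary linearly in the logarithm of pressure, can be sketched as
below. The function name is hypothetical, and returning a missing value for
targets outside the range of the profile is an assumption made here.

```python
import numpy as np


def interp_rh_log_pressure(p_levels, rh_levels, p_target):
    """Interpolate RH linearly in ln(p); NaN outside the profile's range."""
    p_levels = np.asarray(p_levels, dtype=float)
    rh_levels = np.asarray(rh_levels, dtype=float)
    order = np.argsort(p_levels)  # np.interp needs increasing x values
    x = np.log(p_levels[order])
    y = rh_levels[order]
    x_target = np.log(np.asarray(p_target, dtype=float))
    return np.interp(x_target, x, y, left=np.nan, right=np.nan)


if __name__ == "__main__":
    # Two-level synthetic profile: 80 % at 1000 hPa, 40 % at 500 hPa.
    mandatory = [1000.0, 850.0, 700.0, 500.0, 400.0]
    print(interp_rh_log_pressure([1000.0, 500.0], [80.0, 40.0], mandatory))
```

Targets that lie outside the span of the sigma levels, such as 1000 hPa at
an elevated station, come back as missing and drop out of the verification
sample for that level.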

Table 3 shows the RMS error and bias of three existing methods for the
time period corresponding to the dependent sample in February. The smaller
sample sizes at the 1000 hPa and 850 hPa levels shown in Table 3 are due to
observations with small surface pressures, i.e. high station elevation. As
an independent reference, errors associated with climatology are shown as
well; climatology is defined here as the average over all radiosonde
observations in the extracted set within the 3DNEPH boxes, computed
separately for each month. Errors for the AFGWC and Tibaldi methods are
roughly comparable; both methods show some skill when compared to
climatology. The inverse ECMWF formula suffers from a rather large bias.
The corresponding statistics for the bogus RH profiles based on the
regression equations for EOFs 1 and 2 are shown in Table 4. The smaller
sample sizes at the 400 hPa level shown in Table 4 are caused by observations
with surface pressures above 967 hPa, for which the top sigma level lies
below the 400 hPa level, so that no interpolation to 400 hPa is possible. It
can be seen that the regression equations result in smaller errors than the
existing methods; differences between the different regression equations,
i.e. those for EOF-All or EOF-Bands, and those using the vertically compacted
or the full 3DNEPH data set, are generally small.
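
The verification statistics can be sketched as follows. The sign convention
(bogus RH minus observed RH) is an assumption, chosen so that an
overestimate of humidity gives a positive bias; the function names are
hypothetical. The second function illustrates the adjustment applied to the
existing methods in cases of zero cloud cover.

```python
import numpy as np


def bias_and_rmse(estimate, observed):
    """Return (bias, RMSE, sample size), skipping pairs with missing values."""
    estimate = np.asarray(estimate, dtype=float)
    observed = np.asarray(observed, dtype=float)
    valid = ~(np.isnan(estimate) | np.isnan(observed))
    diff = estimate[valid] - observed[valid]  # assumed sign: bogus minus RAOB
    return float(diff.mean()), float(np.sqrt((diff ** 2).mean())), int(valid.sum())


def cap_at_climatology(estimate, cloud_cover, climatology):
    """Limit bogus RH to climatology wherever the cloud cover is zero."""
    estimate = np.asarray(estimate, dtype=float)
    cloud_cover = np.asarray(cloud_cover, dtype=float)
    climatology = np.asarray(climatology, dtype=float)
    return np.where(cloud_cover == 0.0, np.minimum(estimate, climatology), estimate)


if __name__ == "__main__":
    bogus = cap_at_climatology([90.0, 90.0, 55.0], [0.0, 50.0, 0.0], [60.0, 60.0, 60.0])
    print(bogus)
    print(bias_and_rmse(bogus, [50.0, 80.0, np.nan]))
```

The "Average" rows of the tables are then unweighted means of the five
level values, so levels with small samples count as much as the others.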


Table 3: Bias and RMS errors for the existing methods, for the dependent
sample in February. The sample sizes for each statistic are shown in
parentheses. The row labeled "Average" is an unweighted average of the
level values.

          |     AFGWC     |    Tibaldi    | inverse ECMWF |  Climatology
          |  Bias   RMSE  |  Bias   RMSE  |  Bias   RMSE  |  Bias   RMSE
----------+---------------+---------------+---------------+---------------
1000 hPa  | 12.01  17.87  | 12.03  17.71  | 35.32  40.96  |  7.36  21.49
          |     (82)      |     (82)      |     (82)      |     (211)
850 hPa   |  8.85  26.09  | 12.45  27.69  | 19.69  31.02  |  5.71  32.45
          |     (276)     |     (276)     |     (276)     |     (440)
700 hPa   | 14.44  31.50  | 12.30  30.18  | 25.08  37.31  |  2.47  33.28
          |     (302)     |     (302)     |     (302)     |     (470)
500 hPa   |  1.21  26.13  |  3.92  26.21  | 23.35  34.48  | -0.77  31.61
          |     (303)     |     (303)     |     (303)     |     (468)
400 hPa   | -5.33  23.24  |  4.25  23.48  | 28.05  37.33  | -2.50  28.89
          |     (303)     |     (303)     |     (303)     |     (468)
----------+---------------+---------------+---------------+---------------
Average   |  6.24  24.97  |  8.99  25.05  | 26.48  36.22  |  2.45  29.54

Table 4: As Table 3, but for the regression equations. Sets 1 and 2 refer to
the vertically compacted and the full 3DNEPH data set, respectively; EOF-All
and EOF-Bands denote EOFs based on latitude-independent and
latitude-dependent mean RH profiles, respectively.

          | EOF-All, set 1 | EOF-All, set 2 | EOF-Bands, set 1 | EOF-Bands, set 2
          |  Bias   RMSE   |  Bias   RMSE   |  Bias    RMSE    |  Bias    RMSE
----------+----------------+----------------+------------------+------------------
1000 hPa  | 10.83  18.80   | 10.53  18.72   |  9.37   18.07    |  9.14   18.12
          |     (82)       |     (82)       |      (82)        |      (82)
850 hPa   |  2.13  23.27   |  1.99  23.07   |  1.43   22.46    |  1.36   22.38
          |     (276)      |     (276)      |      (276)       |      (276)
700 hPa   | -0.77  26.58   | -1.18  26.36   | -1.00   26.28    | -1.21   25.82
          |     (289)      |     (289)      |      (289)       |      (289)
500 hPa   |  1.31  25.44   |  0.69  25.28   |  1.03   25.06    |  0.63   24.96
          |     (291)      |     (291)      |      (291)       |      (291)
400 hPa   |  4.20  22.40   |  2.78  22.88   |  6.17   21.94    |  5.18   22.44
          |     (106)      |     (106)      |      (106)       |      (106)
----------+----------------+----------------+------------------+------------------
Average   |  3.54  23.30   |  2.96  23.26   |  3.40   22.76    |  3.02   22.75

Table 5 and Table 6 show the verification statistics for the independent
sample in February, which covers the period from 00 UTC 17 February through
00 UTC 22 February. Because of the small differences between the regression
equations using the vertically compacted and the full 3DNEPH data set, only
the results for the former are shown here. The results are qualitatively the
same for the two samples in February. As is to be expected, the errors for
the regression equations are somewhat larger for the independent than the
dependent sample, but they are larger for the existing methods, as well.

Table 5: As Table 3, except for the independent sample in February.

          |     AFGWC     |    Tibaldi    | inverse ECMWF |  Climatology
          |  Bias   RMSE  |  Bias   RMSE  |  Bias   RMSE  |  Bias   RMSE
----------+---------------+---------------+---------------+---------------
1000 hPa  |  2.29  17.78  |  2.65  17.20  | 26.10  32.72  | -7.74  20.80
          |     (104)     |     (104)     |     (104)     |     (230)
850 hPa   | 12.76  29.03  | 15.65  27.27  | 23.10  35.03  |  1.56  33.51
          |     (254)     |     (254)     |     (254)     |     (412)
700 hPa   | 20.59  34.14  | 16.94  32.33  | 30.97  41.38  |  4.44  33.59
          |     (296)     |     (296)     |     (296)     |     (457)
500 hPa   | -7.90  27.55  | 11.17  29.26  | 29.84  39.52  |  4.65  32.06
          |     (288)     |     (288)     |     (288)     |     (458)
400 hPa   | -3.46  27.63  |  7.58  27.50  | 30.49  41.49  | -0.23  29.96
          |     (298)     |     (298)     |     (298)     |     (469)
----------+---------------+---------------+---------------+---------------
Average   |  8.14  27.22  | 10.80  27.59  | 28.10  38.03  |  0.54  29.98

Table 6: As Table 4, except for the independent sample in February. Results
are shown for the regression equations based on the vertically compacted
3DNEPH data set only.

          |    EOF-All    |   EOF-Bands
          |  Bias   RMSE  |  Bias   RMSE
----------+---------------+---------------
1000 hPa  | -1.01  17.81  | -1.75  18.35
          |     (104)     |     (104)
850 hPa   |  3.81  25.39  |  3.79  26.44
          |     (254)     |     (254)
700 hPa   |  5.36  26.60  |  5.01  27.31
          |     (284)     |     (284)
500 hPa   |  7.86  25.87  |  7.21  25.33
          |     (276)     |     (276)
400 hPa   |  5.26  24.93  |  4.39  23.76
          |     (71)      |     (71)
----------+---------------+---------------
Average   |  4.26  24.12  |  3.73  24.24

The error statistics for the existing methods computed here can be
compared to the results obtained by Norquist (1988) in his global colocation
study for February of 1979. His results are similar in that the errors of
the AFGWC and Tibaldi methods are roughly comparable, with slightly smaller
errors for the Tibaldi method. In general, his results show somewhat smaller
errors for all three methods, except for the AFGWC method at 400 hPa, which
shows smaller errors in our results. The positive bias of the inverse ECMWF
method found here is consistent with his results, and also consistent with
the results of Saito and Baba (1988).

Tables 7 and 8 contain the verification statistics for the independent
sample in June, which covers the period 12 UTC 17 June through 00 UTC
24 June. They show a smaller variance about climatology than for February,
in agreement with the results shown in Fig. 2. The errors for the existing
methods, on the other hand, show little change compared to February,
resulting in only marginal or no skill over climatology for the AFGWC or the
Tibaldi method. The errors of the regression equations are smaller for June
than for February, and they are consistently smaller than those of
climatology or the existing methods.


Table 7: As Table 5, but for the independent sample in June.

          |     AFGWC     |    Tibaldi    | inverse ECMWF |  Climatology
          |  Bias   RMSE  |  Bias   RMSE  |  Bias   RMSE  |  Bias   RMSE
----------+---------------+---------------+---------------+---------------
1000 hPa  |  6.82  16.95  |  2.72  15.34  | 27.68  33.35  | -3.31  17.62
          |     (118)     |     (118)     |     (118)     |     (298)
850 hPa   |  8.16  21.47  |  7.50  21.70  | 17.15  25.59  | -2.06  23.38
          |     (540)     |     (540)     |     (540)     |     (774)
700 hPa   | 20.20  30.90  | 15.74  28.29  | 30.06  38.10  |  4.11  27.31
          |     (585)     |     (585)     |     (585)     |     (809)
500 hPa   | 14.77  27.44  | 19.99  31.21  | 34.80  42.09  | 14.59  30.26
          |     (564)     |     (564)     |     (564)     |     (784)
400 hPa   | 10.84  25.09  | 22.76  31.90  | 39.03  46.45  | 16.88  28.89
          |     (584)     |     (584)     |     (584)     |     (804)
----------+---------------+---------------+---------------+---------------
Average   | 12.16  24.37  | 13.74  25.69  | 29.74  37.12  |  6.04  25.49


Table 8: As Table 6, but for the independent sample in June.


          |    EOF-All     |   EOF-Bands    |
          |  Bias    RMSE  |  Bias    RMSE  |

1000 hPa  | -2.57   16.13  | -4.06   16.04  |
          |     (118)      |     (118)      |
850 hPa   | -6.51   20.88  | -6.30   20.80  |
          |     (540)      |     (540)      |
700 hPa   | -5.54   26.06  | -4.79   25.88  |
          |     (557)      |     (557)      |
500 hPa   |  3.95   23.60  |  3.52   23.55  |
          |     (539)      |     (539)      |
400 hPa   |  7.82   22.65  |  8.06   22.63  |
          |     (196)      |     (196)      |

Average   | -0.57   21.87  | -0.71   21.78  |

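The level-averaged June RMSE values in Tables 7 and 8 can be condensed into a
single skill figure relative to climatology. The sketch below assumes a
mean-square-error skill score, 1 - (RMSE / RMSE_clim)^2; this definition and
the function name are illustrative assumptions and are not taken from the
report. Note also that the climatology statistics are based on a larger
sample than those of the other methods, so the numbers are indicative only.

```python
def mse_skill(rmse_method, rmse_reference):
    """Return 1 - MSE_method / MSE_reference (positive: better than reference)."""
    return 1.0 - (rmse_method / rmse_reference) ** 2


# Level-averaged RMSE (percent RH) for June, copied from Tables 7 and 8.
june_rmse = {
    "AFGWC": 24.37,
    "Tibaldi": 25.69,
    "inverse ECMWF": 37.12,
    "EOF-All": 21.87,
    "EOF-Bands": 21.78,
}
RMSE_CLIMATOLOGY = 25.49  # level-averaged climatology RMSE from Table 7

# Assumed skill definition: see the lead-in; not the report's own measure.
june_skill = {
    name: mse_skill(rmse, RMSE_CLIMATOLOGY) for name, rmse in june_rmse.items()
}

for name, skill in june_skill.items():
    print(f"{name:14s} skill vs. climatology: {skill:+.2f}")
```

With this definition the two regression variants come out clearly positive,
the AFGWC method marginally positive, and the Tibaldi and inverse ECMWF
methods negative, in line with the discussion above.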

In summary, we identified regression equations for the first two EOFs of
relative humidity; several different possibilities were investigated, which
differed in the definition of the RH EOFs and in the subset of the 3DNEPH data
considered as predictors. Based on tests using the dependent and independent
samples, it was found sufficient to use regression equations for the
coefficients of EOF-All, i.e. the EOFs based on the mean RH profile computed
for all latitudes between 20° and 50°N, and to use the vertically compacted,
horizontally averaged 3DNEPH data set. Comparison with existing level-to-level
methods for inferring RH from cloud cover showed the regression equations to
perform better for both the dependent and independent samples. It should be
noted that the verification was performed over the same geographical region
for which the regression equations were developed, and that no attempt was
made to tune the existing methods to improve their performance. The comparison
with the level-to-level methods was performed primarily to provide some
independent reference point for the errors associated with the regression
equations, and to demonstrate the feasibility and potential utility of the
approach.
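The way a bogus RH profile follows from the regression equations can be
sketched as below: each EOF coefficient is estimated linearly from the cloud
predictors, and the profile is the mean profile plus the weighted EOFs. All
numbers (mean profile, EOF amplitudes, regression weights, predictors) are
hypothetical placeholders, and the clipping to 0-100% is an assumption of this
sketch rather than a step described in the report.

```python
def predict_coefficient(intercept, weights, predictors):
    """Linear regression estimate of one EOF coefficient from cloud predictors."""
    return intercept + sum(w * x for w, x in zip(weights, predictors))


def reconstruct_profile(mean_profile, eofs, coefficients):
    """Mean profile plus EOF contributions, clipped to 0-100 % RH (assumed)."""
    profile = []
    for k, mean_rh in enumerate(mean_profile):
        rh = mean_rh + sum(c * eof[k] for c, eof in zip(coefficients, eofs))
        profile.append(min(100.0, max(0.0, rh)))
    return profile


# Hypothetical values at five levels (1000, 850, 700, 500, 400 hPa).
mean_profile = [70.0, 65.0, 50.0, 40.0, 38.0]
eofs = [
    [0.30, 0.45, 0.55, 0.50, 0.38],     # EOF 1: same sign at all levels
    [0.55, 0.45, -0.10, -0.50, -0.48],  # EOF 2: low levels against upper levels
]
# One (intercept, weights) pair per EOF; hypothetical regression equations.
regression = [(-30.0, [0.2, 0.4, 0.3]), (5.0, [0.3, -0.1, -0.3])]
# Hypothetical predictors: compacted, averaged low/middle/high cloud cover (%).
cloud_cover = [80.0, 60.0, 10.0]

coefficients = [predict_coefficient(b0, w, cloud_cover) for b0, w in regression]
bogus_rh = reconstruct_profile(mean_profile, eofs, coefficients)
print([round(rh, 1) for rh in bogus_rh])
```

Because only the first two EOFs are predicted, the reconstructed profile can
only represent the large-scale vertical structure of the humidity anomaly.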


5.  Data  Impact  Study 


The method for inferring relative humidity from 3DNEPH data described in the
previous section was used in a data impact study using the global data
assimilation system (GDAS) of the Air Force Geophysics Laboratory (AFGL). The
AFGL GDAS consists of three major components: a global spectral forecast model
(GSM), an optimum interpolation analysis (OI), and a nonlinear normal mode
initialization (NMI). The global spectral model is based on the NMC GSM
designed by Sela (1980); the physics routines were taken almost intact from
NMC (circa 1983), whereas the hydrodynamics were completely redesigned
(Brenner et al., 1982, 1984). The optimum interpolation analysis was developed
by Norquist and others (Norquist, 1982b, 1983, 1984, 1986; Halberstam et al.,
1984), and was originally based on the OI procedures described in Bergman
(1979) and McPherson et al. (1979). The normal mode initialization was based
on the NMC NMI (Ballish, 1980) and is described in Norquist (1982a) and Tung
(1983).

The design of the data assimilation experiment closely follows that described
in Louis et al. (1987). An assimilation experiment consists of assimilation
runs for two 7-day periods in the Special Observing Periods (SOPs) of the FGGE
year: February 8 through 15, 1979, and June 17 through 24, 1979. Each
assimilation run consists of a series of assimilation cycles using a 6-hour
update cycle. Forecasts out to 4 days were produced from the initialized
analyses at days 3, 5, and 7 of the assimilation runs. The assimilation period
of the June assimilation run is identical with the independent sample for that
month used in our colocation study. In February, the data assimilation period
overlaps the dependent sample used in the regression study.

In this section we present mostly differences between an assimilation
experiment using the 3DNEPH-based bogus RH data (referred to as NEPHSAT) and a
control assimilation experiment using only the standard FGGE data set
(STATSAT). The only moisture data used in STATSAT were radiosonde
observations; for an in-depth discussion of the control run the reader is
referred to Louis et al. (1987). The OI analysis program had undergone some
minor changes between the STATSAT and NEPHSAT experiments: changes were made
to the quality-control procedures for drop-windsonde data, and to the
procedures used to solve the normal equations of the analysis program (Hoffman
et al., 1988). This impact test differs from the one reported in Norquist
(1988), in which bogus RH data were used to replace, rather than supplement,
radiosonde measurements. The present OSE is designed to test whether the
potential impact demonstrated by Norquist can be observed in a more realistic
simulation of the operational environment.

The 3DNEPH-based bogus RH data were generated for all half-mesh points located
between 30°N and 50°N (since only those latitudes were used in the derivation
of the regression equations); the error statistics for these data, which are
needed as input to the moisture OI, were generated from the independent sample
of the colocation study (see Appendix C for details).

The impact of the 3DNEPH data is most clearly seen in Fig. 7, which shows
differences of analyzed RH at 850 hPa between NEPHSAT and STATSAT for the
February assimilation run. The first analysis produced in February (February 8
at 06Z) shows differences that are essentially confined to the region
influenced by the 3DNEPH data, i.e. 30°-50°N; the exceptions are the
high-latitude regions of both hemispheres, where sizable differences occur,
most probably due to differences in the OI program between the two experiments.

As is obvious from the succeeding panels in Fig. 7, the differences within the
30°-50°N latitude band as well as outside it grow with time. By 12Z on
February 10, i.e. after 2 1/2 days of assimilation, the RH differences have
spread over the entire globe, with the largest differences occurring at high
latitudes; at that time, the region where bogus RH data were used in NEPHSAT
is no longer visibly different from the rest of the world in these difference
maps. This fairly rapid growth and spread of the RH differences indicates that
the impact of the 3DNEPH data is within the noise level of the system, since
the initial differences are clearly caused by both the different input data
and the different analysis programs. During the early part of the assimilation
run, however, the RH differences within the 30°-50°N latitude band can be
related to the cloudiness data used in NEPHSAT. Plots of the 700 hPa cloud
cover (Fig. 8), which is the predictor with the most influence on the bogus RH
data for the 850 hPa level, reveal some areas of little cloud cover,
particularly over the middle and East Atlantic, which correspond to areas
where the 850 hPa NEPHSAT analyses are drier than the control.

The analyses of geopotential height are quite similar in the two experiments.
During February, differences in the Northern Hemisphere are localized and of
small amplitude (less than 100 m at 500 hPa, 150 m at 1000 hPa) throughout the
entire assimilation run; in the Southern Hemisphere, large-amplitude,
small-scale differences are evident near the pole, which are the result of the
differences in the analysis programs and are not related to the use of bogus
RH data.

To assess the quality of the analyses and the forecasts produced from them,
colocation statistics between the gridded fields and a set of verifying
radiosonde observations were computed. The radiosonde observations used in the
regression study were extracted from this data set. Fig. 9a shows the global
rms error of relative humidity for NEPHSAT and STATSAT analyses and forecasts
for February. Although the NEPHSAT errors are slightly smaller than those of
STATSAT for most of the analyses, these differences (NEPHSAT-STATSAT) are
smaller than the day-to-day variations of the analysis differences
(analysis-RAOB). The forecasts do not indicate one to be superior to the
other. The same general conclusions hold if one computes these statistics for
just the Northern Hemisphere extratropics, or even just over North America
(Fig. 9b and c), where the beneficial impact of the 3DNEPH data would be
expected to be largest. Comparing the magnitudes of the rms errors shown in
Fig. 9 with those of Table 6 might explain part of the reason for this
apparent lack of improvement: the typical analysis errors are smaller than,
and the typical 12-hour forecast errors are only slightly larger than, the
colocation errors of the bogus RH data. The quality of the bogus RH data is
thus comparable to that of the first guess, resulting in only a small positive
impact even in radiosonde-void regions. The generally inconclusive results of
the 850 hPa radiosonde statistics hold for other levels as well.

The results from the June assimilation run are quite similar to those for
February. Difference maps of RH, shown in Fig. 10, reveal the impact of the
cloud data in the first analysis time periods, along with large differences
near the South Pole, which are related to the different analysis program. As
was the case in February, the differences spread quickly over the entire
globe, until at day 2 1/2 the region with cloud data input is no longer
distinguishable from the rest of the globe. The 3DNEPH cloud cover data shown
in Fig. 11 show some features that can be related to the RH differences, in
particular an area of positive RH differences and high values of cloud cover
at 150°W.

The radiosonde statistics for the June assimilation also give no clear
indication that the NEPHSAT forecasts have smaller RH errors than those of
STATSAT.


6.  Summary  and  Conclusions 


Based on a colocation study of radiosonde RH measurements and 3DNEPH
cloudiness data over North America, we developed regression equations for the
first two EOFs of relative humidity. The regression equations predict the
coefficients of EOF-All, i.e. the EOFs based on the mean RH profile computed
for all latitudes between 20° and 50°N, from the vertically compacted,
horizontally averaged 3DNEPH data set. The regression equations performed
better than existing level-to-level methods for inferring RH from cloud cover.
Although the verification was performed over the same geographical region for
which the regression equations were developed, and no attempt was made to tune
the existing methods to improve their performance, this comparison establishes
the feasibility of the approach.

The utility of the bogus RH data for operational data assimilation was
investigated in a global observing system experiment (OSE), in which bogus RH
data were supplied to the moisture analysis in the 30°-50°N latitude belt. The
impact of the 3DNEPH data was evident in analyses in the early part of the
assimilation runs; at later times, it was more difficult to separate the
effects of the bogus RH data from other differences between the NEPHSAT and
control OSE. Comparisons of the analyzed and forecast RH with verifying
radiosonde data did not indicate a measurable positive impact of the bogus RH
data. The inconclusive results from this OSE should not be regarded as
definitive, however; in the following we address the reasons for the lack of
positive impact and suggest possible extensions to the present study.

One of the obvious shortcomings of the present OSE is the limited geographical
extent of the bogus RH data. In future studies, the RH profile approach could
be extended to produce a global bogus RH data set by repeating the regression
study performed here for different regions of the globe. Different EOFs and
different regression equations would then be used in different regions.

Other limitations of the OSE are related to the data assimilation system
itself. Among these, the relatively coarse resolution of the analysis and
forecast, the use of an adiabatic NMI, and the use of a very simple moist
physics package in the GSM are the most significant obstacles to an effective
assimilation of moisture data. There are several potential remedies for these
shortcomings: using a diabatic NMI, in conjunction with a moisture spinup
procedure as suggested in Donner (1988), would minimize the rejection of
initial moisture data by the forecast model. Improvements to the physics
package of the GSM are also necessary to limit the error growth during the
assimilation cycle and the longer range forecasts produced from the analyses;
the physics package currently being implemented and tested by AFGL is expected
to improve this aspect of the GDAS.

Perhaps the most serious limitation to the usefulness of the data is the
relatively large observation error of the bogus RH, which is larger than the
globally averaged error of the current RH analyses. Even with the current
error levels, however, some beneficial impact should be realizable in
otherwise data-void areas. It may also be possible to reduce the errors of the
bogus RH data with changes in the regression approach, such as the definition
of the EOFs or the preprocessing of the cloud data. However, for significant
reductions of the observation errors, it will be necessary to take account of
the fact that there is no one-to-one correspondence between relative humidity
and cloud cover, and to include other atmospheric parameters (e.g., static
stability, vertical motion) in the problem.

Appendix A: Existing cloudiness to humidity conversion techniques

Three existing methods of estimating relative humidity on mandatory pressure
levels were implemented in our colocation study. All three methods were used
with the horizontal mean values of the vertically compacted 3DNEPH cloud cover
corresponding to each mandatory pressure level. In cases of zero cloud cover,
the smaller of the critical relative humidity and climatology was used as the
bogus RH. In the case of the Tibaldi and inverse ECMWF methods, which require
knowledge of the surface pressure, the surface pressure of the colocated
radiosonde observation was used.

A.1 Tibaldi method

Equation (1) was applied to four humidity layers j = 1, ..., 4, where the
layers are defined by $p_j < p < p_{j+1}$, with the $p_j$ and the coefficients
$M_j$, $A_j$ given in Table A.1.


Table A.1: Parameters of the Tibaldi method


j  |  $p_j$  |  $M_j$  |  $A_j$
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1 

| 

1 
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.15  | 
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p3  '  (P2"P5)/3  1 

A.2 AFGWC method

This method is based on empirical relationships between the cloud cover and
the condensation pressure spread (CPS), which are defined for the mandatory
pressure levels 850, 700, 500, and 300 hPa (Table A.2; see Norquist, 1988, for
a graphical display of the curves). We used the 850 hPa curve at 1000 hPa and
the average of the 500 and 300 hPa curves at 400 hPa. The CPS is related to
the dew-point depression (DPD) through the approximate relationship:

$$\mathrm{DPD} = \frac{\mathrm{CPS}}{a_0 + a_1 (p/p_0) + a_2 (p/p_0)^2},$$

where $p_0 = 1000$ hPa, $a_0 = 4.9$ hPa/K, $a_1 = 0.93$ hPa/K, and $a_2 = 9$
hPa/K. To convert the dew-point depression to relative humidity, the
temperatures of the colocated radiosonde were used.
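
The two conversion steps of the AFGWC method (CPS to dew-point depression,
then dew-point depression to relative humidity) can be sketched as follows.
The coefficient values are those quoted above, read as 4.9, 0.93, and 9 hPa/K.
The Magnus approximation for saturation vapor pressure is an assumption of
this sketch, since the report does not state which formula was used; the
function names and the sample input values are likewise hypothetical.

```python
import math

P0_HPA = 1000.0
A0, A1, A2 = 4.9, 0.93, 9.0  # hPa/K, coefficient values as quoted in the text


def cps_to_dpd(cps_hpa, pressure_hpa):
    """Dew-point depression (K) from condensation pressure spread (hPa)."""
    x = pressure_hpa / P0_HPA
    return cps_hpa / (A0 + A1 * x + A2 * x * x)


def saturation_vapor_pressure(temp_c):
    """Magnus approximation over water, in hPa (assumed, not from the report)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def dpd_to_rh(temp_c, dpd_k):
    """Relative humidity (%) from temperature (deg C) and dew-point depression."""
    dew_point_c = temp_c - dpd_k
    return (100.0 * saturation_vapor_pressure(dew_point_c)
            / saturation_vapor_pressure(temp_c))


# Hypothetical input: CPS of 40 hPa at 850 hPa, radiosonde temperature 5 deg C.
dpd = cps_to_dpd(cps_hpa=40.0, pressure_hpa=850.0)
rh = dpd_to_rh(temp_c=5.0, dpd_k=dpd)
print(f"DPD = {dpd:.2f} K, RH = {rh:.0f} %")
```

At 1000 hPa the denominator is about 14.8 hPa/K, so a CPS of roughly 15 hPa
corresponds to a dew-point depression of about 1 K.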

Table A.2


Values of condensation pressure spread (CPS, in hPa) as a function of cloud
cover, for the mandatory pressure levels at 850, 700, 500, and 300 hPa. The
entry in a particular row and column of these tables is the CPS corresponding
to the cloud cover (in %) obtained by adding the labels of the row and column;
e.g., the entry in the first column of the second row corresponds to 11% cloud
cover.

A.3 Inverse ECMWF method

Equation (2) was inverted to yield

$$RH = RH_c + \eta^{1/2} (1 - RH_c),$$

where $RH$ (relative humidity) and $\eta$ (cloud cover) are dimensionless, and
$RH_c$, the critical RH, is given by

$$RH_c = 1 - \alpha \sigma' (1 - \sigma') \left(1 + \beta (\sigma' - 1/2)\right),$$

with $\sigma' = p/p_s$, $\alpha = 2$, and $\beta = 3^{1/2}$.
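
A minimal sketch of the inverse ECMWF conversion is given below. It multiplies
the square root of the cloud cover by (1 - RH_c), which is the form that
returns the critical RH for clear sky and saturation for overcast conditions.
The zero-cloud rule (the smaller of critical RH and climatology) follows the
description at the start of this appendix; the function names and the optional
climatology argument are hypothetical.

```python
import math

ALPHA = 2.0
BETA = math.sqrt(3.0)


def critical_rh(pressure_hpa, surface_pressure_hpa):
    """Critical relative humidity (fraction) at sigma' = p / p_s."""
    sigma = pressure_hpa / surface_pressure_hpa
    return 1.0 - ALPHA * sigma * (1.0 - sigma) * (1.0 + BETA * (sigma - 0.5))


def inverse_ecmwf_rh(cloud_fraction, pressure_hpa, surface_pressure_hpa,
                     climatology_rh=None):
    """Relative humidity (fraction) inferred from cloud cover (fraction)."""
    rh_c = critical_rh(pressure_hpa, surface_pressure_hpa)
    if cloud_fraction <= 0.0 and climatology_rh is not None:
        # Zero cloud cover: use the smaller of critical RH and climatology.
        return min(rh_c, climatology_rh)
    return rh_c + math.sqrt(cloud_fraction) * (1.0 - rh_c)


# Hypothetical example: 25 % cloud cover at 500 hPa over a 1000 hPa surface.
example_rh = inverse_ecmwf_rh(0.25, 500.0, 1000.0)
print(f"RH_c = {critical_rh(500.0, 1000.0):.2f}, RH = {example_rh:.2f}")
```

Because the inferred RH never falls below the critical value whenever any
cloud is present, a method of this form tends to be moist-biased when the
cloud analysis reports small cloud amounts in dry air.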

Appendix  B:  The  3DNEPH  data  base 

The 3DNEPH analysis as described in Fye (1978) provides the following data at
each grid point: terrain height (in tens of meters), total cloud cover (to the
nearest percent), cloud cover for 15 layers (to the nearest 5 percent),
minimum cloud base and maximum cloud top (in WMO code 1677), cloud type for
low, middle, and high clouds (in AFGWC codes), and present weather (in WMO
code 4677, divided by 10). The vertical structure of the 3DNEPH layers is
shown in Table B.1.


Table B.1

Vertical Structure of the 3DNEPH layers.
Adapted from Fye (1978)

Of these, all but the present weather information were used in the colocation
study. Some recoding was performed to allow a quantitative analysis. Cloud
cover values that denoted "thin clouds" were set to zero unless they were at
layers 13 or above. The minimum cloud base and maximum cloud top codes were
converted to heights in meters. The cloud type information was recoded such
that missing information, i.e. no clouds, was assigned a value of zero,
cumuliform cloud types a value of 10, mixed cumuliform and stratiform cloud
types a value of 15, and stratiform cloud types a value of 20. Vertically
compacted cloud cover values were computed assuming maximum overlap; the
3DNEPH layers assigned to the compacted cloud cover layers are shown in
Table B.1. Except for the low cloud value, the vertically compacted cloud
cover values were computed by assigning the terrain-following (AGL) 3DNEPH
layer cloud cover values to the corresponding constant altitude (MSL) layers
in the case of elevated terrain. In order to compute horizontal averages, the
terrain-following 3DNEPH layer values were also remapped to equivalent
constant altitude layers. Horizontal averages and standard deviations were
computed over 5x5 grid points centered around the grid point closest to each
colocated radiosonde station, using a 1-2-1 weighted average in both
directions. For the colocation study, data were extracted from 3DNEPH boxes
43, 44, and 45. For central grid points near box boundaries, averages and
standard deviations were computed over fewer than 25 points. The OSE used data
in all 3DNEPH boxes between 30°N and 50°N.
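
The vertical compaction under the maximum overlap assumption can be sketched
as follows: the compacted cloud cover of a group of layers is simply the
largest cover found among them. The layer values and the grouping into low,
middle, and high layers below are hypothetical placeholders; the actual
assignment of 3DNEPH layers is the one listed in Table B.1.

```python
def compact_max_overlap(layer_cover, layer_groups):
    """Compacted cloud cover per group: the largest cover among its layers."""
    return {
        name: max(layer_cover[i] for i in indices)
        for name, indices in layer_groups.items()
    }


# Hypothetical cloud cover (percent) for 15 layers, index 0 = lowest layer.
layer_cover = [0, 10, 35, 35, 20, 0, 0, 5, 60, 45, 0, 0, 0, 15, 0]

# Hypothetical grouping of layer indices; not the assignment of Table B.1.
layer_groups = {
    "low": range(0, 6),
    "middle": range(6, 10),
    "high": range(10, 15),
}

compacted = compact_max_overlap(layer_cover, layer_groups)
print(compacted)
```

Maximum overlap gives the smallest compacted cover consistent with the layer
values; other overlap assumptions would yield larger values for the same input.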

Appendix C: Error statistics of the bogus RH data

The bogus RH data were used in the moisture OI, with error statistics that
were estimated from the independent sample of the colocation study. Different
error statistics were used for February and June. The observational error
standard deviations ($\sigma_o$) were estimated from the RMS colocation errors
($\sigma_c$; see Tables 6 and 8), using the relationship (Bergman, 1978, eq.
12):


with an assumed RAOB error ($\sigma_r$) of 5%. This same RAOB error was used
in the moisture OI.
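
As a rough illustration of how an observation error can be separated from a
colocation error, the sketch below assumes that the errors of the bogus RH and
of the radiosonde are independent, so that their variances add to give the
colocation error variance. This is a common assumption and may differ in
detail from the relationship in Bergman's paper, which should be consulted for
the exact form. The colocation errors are the June EOF-All values from
Table 8; the function name is hypothetical.

```python
import math

RAOB_ERROR = 5.0  # percent RH, the radiosonde error assumed in the text


def observation_error(colocation_rmse, raob_error=RAOB_ERROR):
    """Observation error s.d., assuming independent errors (variances add)."""
    variance = colocation_rmse ** 2 - raob_error ** 2
    # Guard against a negative variance when the colocation error is small.
    return math.sqrt(max(variance, 0.0))


# June RMS colocation errors (percent RH) for EOF-All, from Table 8.
june_colocation_rmse = {1000: 16.13, 850: 20.88, 700: 26.06, 500: 23.60, 400: 22.65}

june_obs_error = {
    level: observation_error(rmse) for level, rmse in june_colocation_rmse.items()
}

for level, sigma in june_obs_error.items():
    print(f"{level:5d} hPa: observation error about {sigma:.1f} %")
```

With colocation errors of 16 to 26% and a 5% radiosonde error, the correction
is small: the estimated observation errors are less than 1% below the
colocation errors.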

The vertical error correlations (Tables C.1 and C.2) were computed by first
computing a colocation error covariance matrix, using the colocation errors of
the EOF coefficients and the RH EOFs; the assumed RAOB error covariance matrix
was then subtracted, and correlations were computed from the resulting matrix.
RAOB errors were assumed to be uncorrelated in the vertical, resulting in an
error matrix with nonzero elements only along the diagonal, with a magnitude
corresponding to a 5% error. The colocation error covariance between two
levels $k$, $\ell$ is related to the EOF coefficient errors by the following
relationship:

$$\langle \epsilon_{r,k}\, \epsilon_{r,\ell} \rangle = \sum_{m=1}^{15} \langle \epsilon_{c,m}^2 \rangle\, E_{m,k}\, E_{m,\ell}$$

where $\epsilon_{r,k}$ denotes the colocation error of RH at level $k$,
$\epsilon_{c,m}$ the colocation error of EOF coefficient $m$, and $E_{m,k}$
the amplitude of EOF $m$ at level $k$. It was assumed here that the
coefficient errors of two EOFs are uncorrelated (which is strictly true only
for the dependent sample). The above equation follows from the property of the
EOFs that any RH profile may be written as:


$$RH_k = \overline{RH}_k + \sum_{m=1}^{15} e_m E_{m,k}$$

where $\overline{RH}_k$ is the climatological mean profile, and $e_m$ the
coefficient of the mth EOF.


Table C.1

vertical correlations of the bogus RH data for the 15 sigma levels for June.
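
The computation of the vertical error correlations described in this appendix
can be sketched as below: build the colocation error covariance from the EOF
amplitudes and coefficient error variances, subtract a diagonal RAOB error
covariance, and normalize. The EOF amplitudes and coefficient error variances
are hypothetical and use only three levels and two EOFs for brevity.

```python
import math


def colocation_covariance(eofs, coeff_error_variances):
    """C[k][ell] = sum_m var_m * E[m][k] * E[m][ell] (uncorrelated coefficients)."""
    n_levels = len(eofs[0])
    return [
        [
            sum(var * eof[k] * eof[ell]
                for var, eof in zip(coeff_error_variances, eofs))
            for ell in range(n_levels)
        ]
        for k in range(n_levels)
    ]


def observation_correlations(covariance, raob_error):
    """Subtract a diagonal RAOB error covariance, then normalize to correlations."""
    n_levels = len(covariance)
    adjusted = [row[:] for row in covariance]
    for k in range(n_levels):
        adjusted[k][k] -= raob_error ** 2
    return [
        [adjusted[k][ell] / math.sqrt(adjusted[k][k] * adjusted[ell][ell])
         for ell in range(n_levels)]
        for k in range(n_levels)
    ]


# Hypothetical EOF amplitudes (rows: EOFs, columns: levels) and error variances.
eofs = [[0.6, 0.6, 0.5], [0.7, 0.0, -0.7]]
coeff_error_variances = [900.0, 400.0]

covariance = colocation_covariance(eofs, coeff_error_variances)
correlations = observation_correlations(covariance, raob_error=5.0)
for row in correlations:
    print(["%.2f" % value for value in row])
```

Subtracting the RAOB variance only from the diagonal slightly increases the
off-diagonal correlations relative to those of the raw colocation errors.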

Finally, horizontal error correlation functions were derived from colocation
errors. Error correlations were computed for distance bins of 200 km width.
Results showed anomalously high values for the surface level out to large
distances, which were caused by a diurnally varying bias. Because of this
bias, bogus RH values at the surface were not used, and curves were fitted to
error correlation values that were averaged over the remaining levels. The
fitted curves were of the form exp(-d/k), where d is the separation distance,
and k is 290 km (230 km) for February (June).
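
The fitted correlation model can be evaluated as in the sketch below, which
also shows one way of estimating the length scale k from binned correlations.
The fitting procedure is not described in the report; the log-linear
least-squares fit through the origin used here is an assumption, and the
binned values are synthetic, generated from the model itself at the centers of
200 km bins.

```python
import math

LENGTH_SCALE_KM = {"February": 290.0, "June": 230.0}  # values quoted in the text


def horizontal_error_correlation(distance_km, month):
    """Error correlation exp(-d/k) at separation distance d (km)."""
    return math.exp(-distance_km / LENGTH_SCALE_KM[month])


def fit_length_scale(distances_km, correlations):
    """Least-squares fit of ln(r) = -d/k through the origin; returns k in km."""
    numerator = sum(d * d for d in distances_km)
    denominator = sum(d * math.log(r) for d, r in zip(distances_km, correlations))
    return -numerator / denominator


# Centers of 200 km wide distance bins and synthetic correlations for February.
bin_centers_km = [100.0, 300.0, 500.0, 700.0, 900.0]
synthetic = [horizontal_error_correlation(d, "February") for d in bin_centers_km]

fitted_k = fit_length_scale(bin_centers_km, synthetic)
print(f"fitted length scale: {fitted_k:.0f} km")
```

A fit in log space weights small correlations at large distances more heavily
than a direct fit would, so with noisy data the two approaches can give
noticeably different length scales.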


Fig. 3: Mean (a,c) and standard deviation (b,d) relative humidity profiles
for EOF-Bands for February (a,b) and June (c,d). The curve labels (1,2,3)
refer to the three latitude bands (20°N-30°N, 30°N-40°N, 40°N-50°N).


Fig. 4: Profiles of the first three EOFs for EOF-All for February (a) and
June (b).

Fig. 5: As Fig. 4, but for EOF-Bands.

Fig. 7: Contour plots of RH differences between NEPHSAT and STATSAT analyses
for February 8, 06Z (a), February 8, 12Z (b), February 9, 00Z (c), and
February 10, 12Z (d). Contours are drawn every 25%, the zero contour is
suppressed, and negative values are dashed.


Fig. 8: Contour plots of 3DNEPH cloud data for February 8, 12Z (a),
February 9, 00Z (b), and February 10, 12Z (c). Contours are drawn at 25, 50,
and 75%.

Fig. 10: As Fig. 7, but for June 17, 06Z (a), June 17, 12Z (b), June 18, 00Z
(c), and June 19, 12Z (d).

Fig. 11: Continued.